-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
feat: retry on DNS queries. #1180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
c44e405 to
b324723
Compare
|
Tested today, it does improve the situation. However, I would advice to also fail the process early here if a DNS error still happen despite the retry. |
This not needed because the DNSError don't produce a retry and by example a timeout can have several reasons. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure what the expected behaviour for --dns-timeout ought to be. If I provide --dns-timeout=30, I'd expect the SOA query to finish within 30s (give or take a few seconds). Please correct me if I'm wrong, but I believe the query now might take up to 2 minutes.
| bo.InitialInterval = dnsTimeout | ||
| bo.MaxInterval = 2 * bo.InitialInterval | ||
| bo.MaxElapsedTime = 4 * bo.InitialInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sets the max. wait time to 2 minutes if I invoke lego with --dns-timeout=30, doesn't it? Should bo.InitialInterval be dnsTimeout/4 instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial interval must be the same as the timeout use by the DNS client.
The dnsTimeout cannot be used as MaxElapsedTime because it can be too short with the current value.
Currently dnsTimeout is the timeout for one DNS query.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial interval must be the same as the timeout use by the DNS client
I see. Then the documentation of the --dns-timeout flag needs to be modified to something like "this is the minimum timeout given to resolve SOA queries; on network errors the actual timeout might be considerably longer".
Alternatively, we can reduce the default timeout down to something like 2 or 3s to approximate the previous behaviour (on that note, this should work in most cases, because DNS usually resolves fast enough). To accommodate #1008, we could also bump bo.MaxInterval and bo.MaxElapsedTime to 10*initialInterval and 20*initialInterval respectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can reduce the default timeout.
But I think that multiply by 10 or 20 is too big: the dnsQuery is used by several functions in a loop (by example a loop on ns) and this can become very very long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I've missed the context; I thought this was part of the fetchSoaByFqdn methods. Sorry for the confusion on my part.
| bo.InitialInterval = dnsTimeout | ||
| bo.MaxInterval = 2 * bo.InitialInterval | ||
| bo.MaxElapsedTime = 4 * bo.InitialInterval |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I've missed the context; I thought this was part of the fetchSoaByFqdn methods. Sorry for the confusion on my part.
|
I am wondering about the impact and side effects of this PR. I need to take a little more time to think. |
Fixes #1008